character set
Standard character sets contain 128 standard characters, numbered 0 to 127, encoded in one
8-bit byte. Extended character sets contain an additional 128 extended characters,
numbered 128 to 255, encoded in the same 8-bit byte.
Source programs contain only standard characters; extended characters are
discarded. All language elements are composed of printable standard characters
plus three whitespace characters: space, tab, and newline, which have the values
0x20, 0x09, and 0x0A respectively.
To programs, however, characters are unsigned bytes. How these bytes are interpreted
depends on the program, though many intrinsic functions assume the standard
character set.
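The source-text rules above can be sketched in Python. This is only an illustration: the helper names strip_extended and is_language_char are hypothetical, and the printable range 0x21 to 0x7E is an assumption about what "printable standard characters" covers.

```python
# Hypothetical helpers for illustration; not part of the language itself.
WHITESPACE = {0x20, 0x09, 0x0A}  # space, tab, newline

def strip_extended(source: bytes) -> bytes:
    """Drop extended characters (values 128 to 255) from source text."""
    return bytes(b for b in source if b < 128)

def is_language_char(b: int) -> bool:
    """True for printable standard characters and the three whitespace characters.

    Assumption: "printable" means the values 0x21 through 0x7E.
    """
    return 0x21 <= b <= 0x7E or b in WHITESPACE

raw = b"a\xe9=\tb\n"          # 0xE9 is an extended character
clean = strip_extended(raw)   # the extended byte is discarded
```

A running program is under no such restriction: it can store and compare any byte value from 0 to 255.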
Certain groups of characters are referred to by the following names:
Alphabetic          "A - Z", "a - z"
Alphanumeric        "A - Z", "a - z", "0 - 9"
Numeric             "0 - 9"
Binary              "0 - 1"
Octal               "0 - 7"
Hexadecimal         "0 - 9", "A - F", "a - f"
Symbol Characters   "A - Z", "a - z", "0 - 9"
Type Suffixes       @ @@ % %% & && ~ ! # $$ $
Scope Prefixes      # ##
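As a rough illustration, the single-character groups in the table can be expressed as sets and queried. The names GROUPS and groups_of are hypothetical, and the sketch omits the suffix and prefix rows because they contain multi-character entries.

```python
import string

# Character groups from the table above, as sets of single characters.
GROUPS = {
    "Alphabetic": set(string.ascii_letters),
    "Alphanumeric": set(string.ascii_letters + string.digits),
    "Numeric": set(string.digits),
    "Binary": set("01"),
    "Octal": set(string.octdigits),
    "Hexadecimal": set(string.hexdigits),
}

def groups_of(ch):
    """Return the names of every group that contains the character ch."""
    return [name for name, members in GROUPS.items() if ch in members]
```

For example, groups_of("f") reports Alphabetic, Alphanumeric, and Hexadecimal, which shows that the groups overlap rather than partition the character set.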
parse method
To parse means to break program text into language elements. For example,
thisVar=thatVar parses into three language elements:
thisVar - variable
= - assignment operator
thatVar - variable
The following process is performed to find each language element. First, leading
whitespace is ignored. Then, successive characters are collected until adding the
next character would produce an invalid language element.
Whitespace separates language elements that would otherwise be combined into one.
For example, FORK=ATOM means "assign variable ATOM to variable FORK". In contrast,
most conventional BASIC languages interpret FORK=ATOM to mean FOR K = A TO M.
Writing the FOR statement requires that the FOR and TO keywords be separated from
the adjacent variables by whitespace.
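The process described in this section (skip leading whitespace, then collect characters until the next one would make the element invalid) can be sketched in Python. This is a minimal sketch, not the actual parser: the element shapes in is_valid are simplified assumptions (a symbol, a run of digits, or the single operator "="), and the names parse and is_valid are hypothetical.

```python
import re

# Assumed, simplified element shapes; the real language has many more.
SYMBOL = re.compile(r"[A-Za-z][A-Za-z0-9]*")
NUMBER = re.compile(r"[0-9]+")
OPERATORS = {"="}

def is_valid(text):
    """True if text, taken as a whole, is a valid language element."""
    return bool(SYMBOL.fullmatch(text) or NUMBER.fullmatch(text)) or text in OPERATORS

def parse(line):
    """Break a line of program text into language elements."""
    elements = []
    i = 0
    while i < len(line):
        if line[i] in " \t\n":       # leading whitespace is ignored
            i += 1
            continue
        j = i + 1
        if not is_valid(line[i:j]):
            raise ValueError("unexpected character: " + repr(line[i]))
        # Extend the element while adding the next character keeps it valid.
        while j < len(line) and is_valid(line[i:j + 1]):
            j += 1
        elements.append(line[i:j])
        i = j
    return elements
```

Under these assumptions, parse("FORK=ATOM") yields FORK, =, ATOM, while parse("FOR K = A TO M") yields FOR, K, =, A, TO, M. Only in the second case do FOR and TO stand alone as elements that a later stage could recognize as keywords.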